Meta Launches Innovative AU-Net Model to Revolutionize Text Processing
The Meta research team has introduced AU-Net, an architecture designed to break through the limitations of traditional tokenization. The model learns directly from raw bytes, dynamically grouping them into hierarchical sequence representations with an autoregressive U-Net structure. Its contracting path compresses bytes into coarser semantic units, while its expanding path restores information and fuses fine-grained detail back in, with skip connections preserving local features. A multi-stage pooling strategy extracts semantics at the word and phrase levels, and multi-linear upsampling optimizes how that information is fused. Autoregressive generation keeps the output text coherent and significantly improves inference efficiency. This architecture
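The contracting and expanding paths described above can be illustrated with a toy sketch. It is not Meta's implementation: the function names (`word_spans`, `group_spans`, `mean_pool`, `upsample_with_skip`) are hypothetical, whitespace splitting stands in for whatever boundary rule the model uses, mean pooling stands in for its learned pooling, and plain addition stands in for multi-linear upsampling. The sketch also leaves out the learned layers and the causal masking that an autoregressive model would need.

```python
def word_spans(data: bytes) -> list[tuple[int, int]]:
    """Return (start, end) byte ranges of whitespace-separated words."""
    spans, start = [], None
    for i, b in enumerate(data):
        if b in b" \t\n":            # assumption: split on ASCII whitespace
            if start is not None:
                spans.append((start, i))
                start = None
        elif start is None:
            start = i
    if start is not None:
        spans.append((start, len(data)))
    return spans


def group_spans(spans, size=2):
    """Merge every `size` consecutive spans into one coarser span (a 'phrase')."""
    return [
        (spans[i][0], spans[min(i + size, len(spans)) - 1][1])
        for i in range(0, len(spans), size)
    ]


def mean_pool(values, spans):
    """Contracting path (toy): one averaged value per span."""
    return [sum(values[s:e]) / (e - s) for s, e in spans]


def upsample_with_skip(coarse, spans, fine):
    """Expanding path (toy): broadcast each coarse value over its span
    and add the fine-grained value kept by the skip connection."""
    out = list(fine)                 # positions outside any span keep the fine value
    for c, (s, e) in zip(coarse, spans):
        for i in range(s, e):
            out[i] = fine[i] + c
    return out


text = "bytes become words".encode("utf-8")
byte_values = [float(b) for b in text]      # stand-in for byte embeddings

words = word_spans(text)                    # stage 1: word-level spans
phrases = group_spans(words, size=2)        # stage 2: phrase-level spans

word_repr = mean_pool(byte_values, words)
restored = upsample_with_skip(word_repr, words, byte_values)
print(words, phrases, len(restored))
```

Each stage shortens the sequence the deeper layers have to process, while the skip connection lets the expanding path recover byte-level detail that pooling discarded.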